
Blob storage artifact-data keys, binary sanitizer, and observability stripping#2744

Open
mike-inkeep wants to merge 12 commits into main from stack/artifact_binary_sanitizer

Conversation

@mike-inkeep
Contributor

@mike-inkeep mike-inkeep commented Mar 18, 2026

Summary

Adds an artifact binary sanitizer that offloads inline binary data (images, files) embedded in artifact payloads to blob storage, and strips those payloads from observability spans before they reach the tracing backend. Also introduces a dedicated artifact-data storage key category, fixes Vercel file part mapping to use url instead of text, and adds inlineExternalPdfUrlParts to support downloading external PDF URLs as inline bytes.

Key decisions

Sanitizer runs at persist time, not at generation time. Artifact data is sanitized when it is written to storage, so any consumer reading back blob URIs gets a stable reference. An alternative (sanitizing at read time) was not chosen because it would require every read path to do async work.

Content-hash-based artifact-data storage keys. The key structure is v1/t_{tenantId}/artifact-data/p_{projectId}/a_{artifactId}/sha256-{hash}.{ext}. Using a content hash means duplicate binary payloads within the same artifact are deduplicated automatically, and keys are deterministic without needing to generate UUIDs.
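The key scheme can be sketched in isolation. This is illustrative, not the PR's actual builder (which is `buildStorageKey` with an `ArtifactDataKeyInput` variant); the function and input names here are assumptions:

```typescript
// Illustrative sketch of content-addressed artifact-data keys, following the
// format described above. Not the PR's buildStorageKey implementation.
import { createHash } from 'node:crypto';

interface ArtifactDataKeyInput {
  tenantId: string;
  projectId: string;
  artifactId: string;
  data: Buffer; // decoded binary payload
  ext: string;  // file extension derived from the mime type
}

function buildArtifactDataKey(input: ArtifactDataKeyInput): string {
  // Hash the decoded bytes so identical payloads map to the same key.
  const hash = createHash('sha256').update(input.data).digest('hex');
  return `v1/t_${input.tenantId}/artifact-data/p_${input.projectId}/a_${input.artifactId}/sha256-${hash}.${input.ext}`;
}

// Two uploads of the same bytes produce the same key — automatic dedup.
const a = buildArtifactDataKey({
  tenantId: 't1', projectId: 'p1', artifactId: 'art1',
  data: Buffer.from('same payload'), ext: 'png',
});
const b = buildArtifactDataKey({
  tenantId: 't1', projectId: 'p1', artifactId: 'art1',
  data: Buffer.from('same payload'), ext: 'png',
});
console.log(a === b); // true
```

Because the key is derived from the content, no UUID generation or key lookup is needed before upload.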

stripBinaryDataForObservability is synchronous and non-destructive. Rather than uploading to blob storage before writing spans (which would add latency and an async dependency to the span-writing hot path), binary parts are replaced with a [binary data ~N bytes, mimeType: X] placeholder. This keeps span sizes small without blocking.

inlineExternalPdfUrlParts fetches PDFs before the upload pipeline. Previously, external PDF URIs in file parts were explicitly blocked. This helper downloads them early (in buildPersistedMessageContent) and converts to inline bytes so the rest of the upload pipeline can process them uniformly.
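A rough sketch of that idea follows. The helper names here are hypothetical (the PR's actual helper is `inlineExternalPdfUrlParts`, operating on message parts); this only illustrates the download-then-inline shape:

```typescript
// Hypothetical sketch: fetch an external PDF URL and convert it to an inline
// base64 file part so the upload pipeline can process it uniformly.
type FilePart = { type: 'file'; data: string; mimeType: string };

function pdfBytesToInlinePart(bytes: Uint8Array): FilePart {
  return {
    type: 'file',
    data: Buffer.from(bytes).toString('base64'),
    mimeType: 'application/pdf',
  };
}

async function inlineExternalPdfUrl(
  url: string,
  fetchFn: typeof fetch = fetch,
): Promise<FilePart | null> {
  const res = await fetchFn(url);
  if (!res.ok) return null; // caller decides how to surface the failure
  const bytes = new Uint8Array(await res.arrayBuffer());
  return pdfBytesToInlinePart(bytes);
}
```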

Manual QA

No manual QA performed. Consider running /qa to generate and execute a test plan.

@changeset-bot

changeset-bot bot commented Mar 18, 2026

⚠️ No Changeset found

Latest commit: 7dd41d8

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@vercel

vercel bot commented Mar 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| agents-api | Ready | Preview, Comment | Mar 26, 2026 2:01am |
| agents-manage-ui | Ready | Preview, Comment | Mar 26, 2026 2:01am |

1 Skipped Deployment

| Project | Deployment | Updated (UTC) |
| agents-docs | Skipped | Mar 26, 2026 2:01am |

Contributor

@claude claude bot left a comment


PR Review Summary

(6) Total Issues | Risk: Medium

🟠⚠️ Major (2) 🟠⚠️

🟠 1) artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures

Issue: The uploadInlinePart function calls storage.upload() without a try-catch block. If the blob storage upload fails (network error, quota exceeded, permission denied), the promise rejection propagates and could fail the entire artifact processing pipeline.

Why: The existing pattern in image-upload.ts (lines 90-98) gracefully catches upload failures and drops the problematic part while preserving the rest of the content. This inconsistency creates a reliability gap where a transient storage failure could cause complete artifact processing failure rather than graceful degradation.

Fix: Add error handling around the upload call that either (1) returns the original part with base64 data intact as a fallback, or (2) replaces the data with an error placeholder. The first option preserves data at the cost of larger observability payloads; the second maintains observability cleanliness but loses the data.

🟠 2) sanitizeArtifactBinaryData Exported function is not used in production code

Issue: The PR adds two functions: stripBinaryDataForObservability (used in 4 places in AgentSession.ts) and sanitizeArtifactBinaryData (an async function that uploads binary data to blob storage). However, sanitizeArtifactBinaryData is exported and tested but never imported or called anywhere in the codebase outside its test file.

Why: This means the artifact-data storage key category is added but no code path actually uploads artifacts using it. The infrastructure exists but is not wired up to any data flow. This may be intentional preparatory work, but it's unclear from the PR.

Fix: Either (1) defer adding sanitizeArtifactBinaryData until the consuming code is ready, (2) document in the PR description that this is preparatory infrastructure with follow-up work planned, or (3) wire up the function to the appropriate data flow (e.g., artifact persistence in ArtifactService).


Inline Comments:

  • 🟠 Major: artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures

🟡 Minor (4) 🟡

🟡 1) storage-keys.ts:37-45 Missing unit tests for new artifact-data storage key category

Issue: The existing storage-keys.test.ts tests buildStorageKey for the media category but was not updated to test the new artifact-data category.
Why: The storage key format is critical for data integrity and multi-tenant isolation. A regression in the key format could cause lookup failures or tenant data leakage.
Fix: Add a test case in storage-keys.test.ts:

it('builds versioned artifact-data storage key', () => {
  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: 'tenant-1',
    projectId: 'project-1',
    artifactId: 'artifact-1',
    contentHash: 'abc123',
    ext: 'png',
  });
  expect(key).toBe('v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png');
});

🟡 2) artifact-binary-sanitizer.ts:27 Magic number threshold for binary detection

Issue: The 100 character threshold for detecting binary data is hardcoded without explanation.
Why: This represents only ~75 decoded bytes. Without documentation, it's unclear if this was deliberately chosen or if a larger threshold would be more appropriate.
Fix: Extract to a named constant: const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes

🟡 3) artifact-binary-sanitizer.test.ts Missing test for data: URI handling

Issue: The implementation checks !v.data.startsWith('data:') but there's no explicit test.
Why: Data URIs are common for inline images; without a test, this check could be accidentally removed.
Fix: Add test case for data URIs.

🟡 4) artifact-binary-sanitizer.test.ts Missing test for upload failure scenario

Issue: Tests mock upload to always succeed but never test the failure path.
Why: Masks the missing error handling in the implementation.
Fix: Add failure test case.

Inline Comments:

  • 🟡 Minor: artifact-binary-sanitizer.ts:19 Magic number threshold lacks documentation
  • 🟡 Minor: artifact-binary-sanitizer.test.ts:109 Missing test for upload failure handling
  • 🟡 Minor: artifact-binary-sanitizer.test.ts:57 Missing test for data: URI handling

💭 Consider (2) 💭

💭 1) storage-keys.ts Missing companion parse/prefix functions for artifact-data category

Issue: The media category has parseMediaStorageKey and buildMediaStorageKeyPrefix helpers, but artifact-data only has the buildStorageKey case.
Why: May be needed later for cleanup, authorization checks, or debugging. If not needed, this is acceptable.
Fix: Add a comment explaining why parse/prefix helpers are not needed, or add them for consistency.

💭 2) AgentSession.ts Pretty-print JSON in span attributes

Issue: Uses JSON.stringify(..., null, 2) for span attributes which adds unnecessary whitespace.
Why: Span attributes are stored and indexed; compact format reduces storage overhead.
Fix: Remove null, 2 arguments: JSON.stringify(stripBinaryDataForObservability(artifactData.data))


💡 APPROVE WITH SUGGESTIONS

Summary: This PR adds valuable infrastructure for handling binary data in artifacts. The stripBinaryDataForObservability function is correctly integrated and will improve observability by preventing large binary payloads in spans. However, two issues warrant attention: (1) the upload function lacks error handling unlike the existing image-upload.ts pattern, and (2) sanitizeArtifactBinaryData appears to be unused — clarification on whether this is intentional preparatory work would be helpful. Adding tests for the new storage key category would also strengthen the PR.

Discarded (3)
| Location | Issue | Reason Discarded |
| artifact-binary-sanitizer.test.ts | Missing test for missing mimeType fallback | Low severity (INFO level); the behavior is implicitly covered and mimeType presence is common |
| artifact-binary-sanitizer.test.ts | Missing boundary test at exactly 100 chars | Nice-to-have but unlikely to cause real bugs; current tests cover above/below threshold adequately |
| AgentSession.ts spans | Redundant JSON.stringify pretty-print | Moved to Consider as it's a minor optimization, not a correctness issue |
Reviewers (6)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| pr-review-precision | 3 | 1 | 1 | 0 | 0 | 0 | 1 |
| pr-review-tests | 6 | 2 | 0 | 0 | 2 | 0 | 2 |
| pr-review-sre | 4 | 1 | 1 | 0 | 1 | 0 | 1 |
| pr-review-consistency | 2 | 1 | 0 | 0 | 0 | 0 | 1 |
| pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-types | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 15 | 5 | 2 | 0 | 3 | 0 | 5 |

Note: Some findings were merged across reviewers (e.g., error handling raised by both tests and SRE reviewers).

ext,
});

await storage.upload({ key, data: buffer, contentType: mimeType });
Contributor


🟠 MAJOR: Missing error handling for storage upload failures

Issue: The uploadInlinePart function calls storage.upload() without a try-catch. If the upload fails (network error, quota exceeded, permission denied), the error propagates and could fail the entire artifact processing pipeline.

Why: The existing pattern in image-upload.ts (lines 90-98) catches upload failures gracefully. This inconsistency could cause unhandled errors and partial artifact state when storage is temporarily unavailable.

Fix: Consider wrapping the upload in a try-catch:

try {
  await storage.upload({ key, data: buffer, contentType: mimeType });
  return { ...part, data: toBlobUri(key) };
} catch (error) {
  // Option 1: Return original part (fallback to base64)
  return part;
  // Option 2: Return placeholder
  // return { ...part, data: `[upload failed, binary ~${buffer.length} bytes]` };
}


[key: string]: unknown;
};

const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]';
Contributor


🟡 Minor: Magic number threshold lacks documentation

Issue: The 100 character threshold for detecting binary data is hardcoded without explanation. This represents only ~75 decoded bytes, which may be smaller than intended.

Why: Without documentation, it's unclear whether this threshold was deliberately chosen or if a larger value (e.g., 1KB) would better distinguish binary content from short encoded strings.

Fix:

Suggested change
const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]';
const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes; data below this is likely metadata, not binary payload

Then use the constant in the check below.

download: vi.fn(),
delete: vi.fn(),
});
});
Contributor


🟡 Minor: Missing test for upload failure handling

Issue: The test suite mocks upload to always succeed but never tests the failure path.

Why: Given that image-upload.test.ts has a test for 'drops file part when storage.upload throws' (line 148), this gap masks the missing error handling in the implementation.

Fix: Add a test case:

it('propagates upload errors when blob storage fails', async () => {
  mockUpload.mockRejectedValueOnce(new Error('Storage quota exceeded'));
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('Storage quota exceeded');
});


it('leaves http URLs untouched', () => {
const input = { type: 'image', data: 'https://example.com/img.png', mimeType: 'image/png' };
const result = stripBinaryDataForObservability(input) as any;
expect(result.data).toBe('https://example.com/img.png');
Contributor


🟡 Minor: Missing test for data: URI handling

Issue: Line 30 of the implementation checks !v.data.startsWith('data:') but there's no explicit test verifying data URIs are left untouched.

Why: Data URIs are a common format for inline images. Without a test, a refactor could accidentally remove the check, causing data URIs to be re-uploaded unnecessarily.

Fix: Add test cases:

it('leaves data: URIs untouched', () => {
  const dataUri = 'data:image/png;base64,' + LARGE_BASE64;
  const input = { type: 'image', data: dataUri, mimeType: 'image/png' };
  const result = stripBinaryDataForObservability(input) as any;
  expect(result.data).toBe(dataUri);
});

Base automatically changed from stack/history_shape to main March 25, 2026 20:07
@pullfrog
Contributor

pullfrog bot commented Mar 25, 2026

TL;DR — Adds infrastructure for offloading inline binary data (images, files) from artifact payloads to blob storage, and strips that binary data from observability spans so traces and logs don't balloon with base64 content.

Key changes

  • Add artifact-data category to buildStorageKey — extends the storage-key builder with a new ArtifactDataKeyInput discriminant that produces content-addressed keys scoped to tenant, project, and artifact.
  • New artifact-binary-sanitizer module — sanitizeArtifactBinaryData recursively walks arbitrary artifact data, detects inline base64 image/file parts, uploads them to blob storage, and replaces the data field with a blob:// URI. stripBinaryDataForObservability does the same walk but swaps data with a human-readable placeholder instead of uploading.
  • Handle data: URIs and graceful upload failures — the sanitizer parses data: URI strings to extract the embedded mime type and base64 payload before upload, and catches storage errors to return the original inline data rather than throwing.
  • Strip binary data from AgentSession observability attributes — four call sites in AgentSession.ts that serialize artifactData.data into span attributes now wrap the value in stripBinaryDataForObservability to keep traces lightweight.
  • Export parseDataUri from message-parts.ts — promotes an existing helper to a shared export so the binary sanitizer can reuse it.
  • Comprehensive tests for both sanitizer functions — covers upload-and-replace, data: URI extraction, idempotent blob URIs, graceful upload failure fallback, recursive nested structures, circular-reference safety, and observability placeholder formatting.

Summary | 6 files | 12 commits | base: main ← stack/artifact_binary_sanitizer


Content-addressed blob storage keys for artifact data

Before: buildStorageKey only supported conversation-media keys scoped to conversation/message.
After: A new artifact-data category produces keys like v1/t_{tenant}/artifact-data/p_{project}/a_{artifact}/sha256-{hash}.{ext}, enabling content-addressed deduplication per artifact.

The ArtifactDataKeyInput type is added as a union member to StorageKeyInput, and the switch in buildStorageKey gains a new case.

storage-keys.ts


Binary sanitizer for artifact data

Before: Inline base64 blobs in artifact data were stored and transmitted as-is — no offloading, no size reduction.
After: sanitizeArtifactBinaryData detects inline image/file parts (any data string that isn't a blob:// or http URL), uploads them to blob storage via content hash, and replaces data with a blob:// URI. Already-uploaded or URL-based data is left untouched.

The detection heuristic (isInlineBinaryPart) checks for type: 'image' | 'file', a non-empty data string, and excludes blob:// and http URLs — there is no minimum-length threshold. data: URIs are parsed via parseDataUri to extract the embedded mime type and raw base64 payload before hashing and upload. Both the upload sanitizer and the observability stripper share this heuristic and use a WeakSet-based cycle guard for circular references.
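A minimal sketch of this heuristic, assuming the shapes described above (the function names match the description, but the exact signatures and the `parseDataUri` regex are assumptions, not the PR's code):

```typescript
// Sketch of the detection heuristic: image/file parts with a non-empty data
// string that is neither an already-uploaded blob:// URI nor an http(s) URL.
type MaybePart = { type?: unknown; data?: unknown; mimeType?: unknown };

function isInlineBinaryPart(v: MaybePart): boolean {
  return (
    (v.type === 'image' || v.type === 'file') &&
    typeof v.data === 'string' &&
    v.data.length > 0 &&
    !v.data.startsWith('blob://') &&
    !v.data.startsWith('http')
  );
}

// data: URIs carry their own mime type; extract it plus the raw base64
// payload so the mime type from the URI can take precedence.
function parseDataUri(s: string): { mimeType: string; base64: string } | null {
  const m = /^data:([^;,]+);base64,(.*)$/s.exec(s);
  return m ? { mimeType: m[1], base64: m[2] } : null;
}
```

Note that a `data:` URI passes `isInlineBinaryPart` (it is neither `blob://` nor `http`), which is what lets the sanitizer parse and upload it rather than leave it inline.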

Upload failures are caught and logged with structured context (tenant, project, artifact, mime type, size) — the original inline data is returned unchanged, keeping the pipeline resilient to storage outages.

How does the observability stripper differ from the upload sanitizer?

stripBinaryDataForObservability is synchronous and never calls blob storage — it replaces detected binary data with a placeholder string like [binary data ~150 bytes, mimeType: image/png]. This keeps span attributes and log payloads small without requiring any I/O.

artifact-binary-sanitizer.ts · artifact-binary-sanitizer.test.ts · message-parts.ts
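A minimal sketch of how such a synchronous stripper can work, assuming the behavior described above (the function name, placeholder arithmetic, and exact output format are illustrative, not the PR's implementation):

```typescript
// Sketch of a synchronous observability stripper: recursive walk with a
// WeakSet cycle guard, replacing detected binary parts with a placeholder.
function stripBinaryForSpans(value: unknown, seen = new WeakSet<object>()): unknown {
  if (value === null || typeof value !== 'object') return value;
  if (seen.has(value)) return '[Circular Reference]';
  seen.add(value);
  if (Array.isArray(value)) return value.map((v) => stripBinaryForSpans(v, seen));
  const obj = value as Record<string, unknown>;
  // Same heuristic as the upload sanitizer: inline image/file data that is
  // neither a blob:// URI nor an http(s) URL.
  const isBinaryPart =
    (obj.type === 'image' || obj.type === 'file') &&
    typeof obj.data === 'string' &&
    obj.data.length > 0 &&
    !obj.data.startsWith('blob://') &&
    !obj.data.startsWith('http');
  const out: Record<string, unknown> = {};
  for (const [k, v] of Object.entries(obj)) {
    if (isBinaryPart && k === 'data') {
      // Approximate decoded size from the base64 length; no I/O, no upload.
      const bytes = Math.floor(((v as string).length * 3) / 4);
      out[k] = `[binary data ~${bytes} bytes, mimeType: ${obj.mimeType ?? 'unknown'}]`;
    } else {
      out[k] = stripBinaryForSpans(v, seen);
    }
  }
  return out;
}
```

Because the walk is synchronous and never touches storage, it is safe to call on the span-writing hot path.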


Observability stripping in AgentSession

Before: artifactData.data was serialized directly into span attributes via JSON.stringify with pretty-printing, embedding potentially megabytes of base64 into every trace.
After: All four serialization sites wrap the value in stripBinaryDataForObservability before stringifying, and drop the pretty-printing (null, 2) since span attributes don't benefit from it.

The affected call sites are the artifact receipt span, the artifact data preview for context truncation, and both artifact summary spans in the tool-result handling paths.

AgentSession.ts


Contributor

@claude claude bot left a comment


PR Review Summary

(6) Total Issues | Risk: Medium

Delta Since Last Review

The only change since the prior review is a single import path fix:

-import { getExtensionFromMimeType } from '@inkeep/agents-core/constants/allowed-image-formats';
+import { getExtensionFromMimeType } from '@inkeep/agents-core/constants/allowed-file-formats';

This is a correct fix — the module was renamed from allowed-image-formats to allowed-file-formats in main. ✅


🕐 Pending Recommendations (6)

The following issues were raised in the prior review and remain unaddressed:

Major

  • 🟠 artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures — uploadInlinePart lacks try-catch around storage.upload(), inconsistent with file-upload.ts pattern
  • 🟠 sanitizeArtifactBinaryData Exported function is unused in production code — infrastructure exists but not wired to any data flow

Minor


🚫 REQUEST CHANGES

Summary: The delta since the prior review is a trivial import path fix which is correct. However, the 2 Major and 4 Minor issues identified in the previous review remain unaddressed. The most critical gaps are: (1) missing error handling for storage upload failures which creates inconsistency with the existing file-upload.ts pattern and could cause complete artifact processing failure on transient storage errors, and (2) sanitizeArtifactBinaryData being exported but unused, suggesting incomplete integration or preparatory infrastructure that should be documented.

Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| N/A — delta review | 0 | 0 | 0 | 0 | 0 | 6 | 0 |

Note: Domain-specific reviewers were not dispatched for this re-review because the delta consists only of a single import path fix. All 6 pending recommendations are carried forward from the prior review.

@github-actions github-actions bot deleted a comment from claude bot Mar 25, 2026
@itoqa

itoqa bot commented Mar 26, 2026

Ito Test Report ❌

17 test cases ran. 1 failed, 16 passed.

Overall, 17 test cases were executed (16 passing, 1 failing), showing strong stability across playground chat submission and double-send gating, traces conversation/history routing, outage and navigation resilience, artifact sanitization and copy/download behavior, mobile rendering, session invalidation, and authorization/large-payload protections. The key issue is a Medium-severity traces UI bug: untrusted trace text is rendered via markdown (Streamdown) in timeline/detail views, causing script-tag rendering instability instead of consistently inert plain-text handling and reducing observability reliability.

❌ Failed (1)
Category Summary Screenshot
Adversarial 🟠 Traces UI renders untrusted trace text via markdown, producing script-tag rendering instability instead of consistently inert text handling. ADV-1
🟠 XSS payload in artifact fields is not executed
  • What failed: Untrusted trace text is rendered through Streamdown markdown components, and script-tag payloads cause rendering errors/instability instead of being handled as inert plain text.
  • Impact: Malformed or adversarial trace content can destabilize traces rendering and degrade operator visibility during debugging. This weakens reliability of the observability UI for affected conversations.
  • Introduced by this PR: No – pre-existing bug (code not changed in this PR)
  • Steps to reproduce:
    1. Open a traces conversation containing tool/message fields with <script>-style payload text.
    2. Navigate the timeline rows and message detail panel that render trace text fields.
    3. Observe rendering instability/errors instead of strictly inert plain-text output.
  • Code analysis: I inspected traces activity shaping in the conversation API route and the UI timeline/detail renderers. The route builds descriptions from span attributes (including tool names), and UI components pass those strings directly into Streamdown without explicit sanitization or plain-text escaping on these paths.
  • Why this is likely a bug: Production code combines span-derived text with markdown rendering in the traces UI, creating a plausible and repeatable unsafe-rendering path for adversarial strings.

Relevant code:

agents-manage-ui/src/components/traces/timeline/timeline-item.tsx (lines 260-268)

<button
  type="button"
  onClick={onSelect}
  className={`flex items-center gap-1 group cursor-pointer transition-colors duration-200 ${textColorClass}`}
  title="Click to view details"
>
  <span className="font-medium">
    <Streamdown>{activity.description}</Streamdown>
  </span>

agents-manage-ui/src/components/traces/timeline/render-panel-content.tsx (lines 262-265)

<LabeledBlock label="Message content">
  <Bubble className="break-words">
    <Streamdown>{a.messageContent || 'Message content not available'}</Streamdown>
  </Bubble>
</LabeledBlock>

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts (lines 765-771)

const toolCall = getString(span, SPAN_KEYS.SPAN_ID, '');
activities.push({
  id: toolCall,
  type: ACTIVITY_TYPES.TOOL_CALL,
  toolName: name,
  description: hasError && statusMessage ? `Tool ${name} failed` : `Called ${name}`,
  timestamp: span.timestamp,

✅ Passed (16)
Category Summary Screenshot
Adversarial Tampered deep-link requests remain tenant/project scoped; no cross-scope payload exposure was verified. ADV-2
Adversarial Large inline binary-like artifact data is sanitized/stripped, with oversized retrieval warnings preserved for traces UI. ADV-4
Adversarial Cross-tab logout invalidated the other tab session and reloading redirected to login as expected. ADV-5
Edge Rapid 5x send attempts were gated to one submission and the chat UI remained stable. EDGE-1
Edge Mobile traces detail flow at 390x844 remained readable and interactive with no horizontal overflow. EDGE-3
Edge Short binary-like artifact data remains visible without over-redaction. EDGE-4
Edge Circular placeholder text stays inert and export flows keep valid JSON. EDGE-5
Edge Oversized artifact warning and metadata remain visible and accurate. EDGE-6
Edge Rapid side-panel open/close interactions remain stable and keep artifact details synchronized. EDGE-7
Happy-path Playground prompt submission completed and assistant output rendered successfully. ROUTE-1
Happy-path Trace conversation route includes artifact_processing query/mapping and handles SigNoz connectivity errors gracefully; no product logic defect found. ROUTE-3
Happy-path Deep-linked traces conversation detail handled forced backend outage with a stable error state across reloads and no crash. ROUTE-4
Happy-path Conversation history route behavior remains coherent; route-level integration tests pass for artifact-heavy message shaping. ROUTE-5
Screen Artifact processing timeline card renders sanitized artifact data with binary placeholder in local UI fixture flow. SCREEN-1
Screen Artifact details panel copy/download actions export sanitized JSON without raw binary leakage. SCREEN-2
Screen Non-linear traces navigation preserved conversation context via URL-scoped conversation routes and rehydration on refresh/deep-link flows. SCREEN-3

Commit: 4c9a209

View Full Run


Tell us how we did: Give Ito Feedback

This function is intentionally exported for use in artifact persistence (see PR #2745).
The observability-only stripping path is `stripBinaryDataForObservability`.
export async function sanitizeArtifactBinaryData()
Contributor

@claude claude bot left a comment


PR Review Summary

(0) Total Issues | Risk: Low

Delta Since Last Review

The author addressed all 6 prior findings from the previous review:

| Prior Finding | Status | Implementation |
| 🟠 Missing error handling for upload | FIXED | Try-catch added with graceful fallback (lines 58-73) |
| 🟠 Unused sanitizeArtifactBinaryData | Understood | Intentional preparatory infrastructure |
| 🟡 Missing artifact-data storage key test | FIXED | Test added (storage-keys.test.ts:61-72) |
| 🟡 Magic number (100) threshold | RESOLVED | Changed to > 1 — now detects all binary data |
| 🟡 Missing data: URI test | FIXED | Tests added for both stripping (lines 53-61) and sanitization (lines 193-206) |
| 🟡 Missing upload failure test | FIXED | Test added (lines 208-214) |

New Changes in Delta

The delta commit (f5f09e008) adds:

  1. parseDataUri integration — Data URIs are now parsed and their base64 payloads extracted (mimeType from URI takes precedence)
  2. Robust error handling — Upload failures log comprehensive context and gracefully fall back to original inline data
  3. Pretty-print removal — Removed null, 2 from JSON.stringify calls in span attributes (compact format)
  4. Comprehensive test coverage — All prior gaps addressed with well-structured tests

💭 Consider (1) 💭

Inline Comments:

  • 💭 Consider: artifact-binary-sanitizer.test.ts:22 Missing boundary test for threshold change from > 100 to > 1

🕐 Pending Recommendations (1)

  • 🟠 sanitizeArtifactBinaryData Function is exported and tested but unused in production — understood as preparatory infrastructure for a follow-up PR that wires it into artifact persistence

✅ APPROVE

Summary: Excellent follow-up work. All prior feedback has been addressed with well-implemented solutions. The error handling pattern (try-catch with graceful fallback to original data) is consistent with existing patterns in the codebase. The sanitizeArtifactBinaryData function being unused is understood as preparatory infrastructure — the observability stripping via stripBinaryDataForObservability is the production-ready piece that's correctly integrated into AgentSession.ts. Ship it! 🚀

Discarded (2)
| Location | Issue | Reason Discarded |
| artifact-binary-sanitizer.ts:45 | Buffer.from doesn't validate base64 input | Edge case — image/file parts from LLM tool results should be valid base64; any corruption would be visible when artifact fails to render |
| artifact-binary-sanitizer.test.ts:193 | Missing test for malformed data URIs | Low priority edge case — parseDataUri returns null, code safely falls back to treating entire string as base64 |
Reviewers (3)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-tests | 2 | 0 | 0 | 0 | 1 | 0 | 1 |
| pr-review-errors | 3 | 0 | 0 | 0 | 0 | 0 | 1 |
| Total | 5 | 0 | 0 | 0 | 1 | 0 | 2 |

Note: Most reviewer findings were INFO-level acknowledgments of correctly implemented prior feedback.

),
}));

const SAMPLE_BASE64 = Buffer.from('sample image bytes').toString('base64');
Contributor


💭 Consider: Missing boundary test for threshold change

Issue: The threshold changed from > 100 to > 1, but no test verifies this boundary. SAMPLE_BASE64 (~24 chars) passes both thresholds.

Why: Without a boundary test, a future refactor could accidentally change > 1 to >= 1 or back to > 100 without test failure.

Fix: Consider adding boundary tests:

it('treats 2-char data string as binary part', () => {
  const input = { type: 'image', data: 'AA', mimeType: 'image/png' };
  const result = stripBinaryDataForObservability(input) as any;
  expect(result.data).toMatch(/^\[binary data/);
});

it('does not treat 1-char data string as binary part', () => {
  const input = { type: 'image', data: 'A', mimeType: 'image/png' };
  const result = stripBinaryDataForObservability(input) as any;
  expect(result.data).toBe('A');
});

@github-actions github-actions bot deleted a comment from claude bot Mar 26, 2026
Contributor

@claude claude bot left a comment


PR Review Summary

(0) Total Issues | Risk: Low

Delta Since Last Review

The only change since the prior review is commit 7dd41d83b ("Remove unnecessary check"):

-    v.data.length > 1 &&

This removes the minimum length check from isInlineBinaryPart, so now any non-empty string data (that isn't a blob:// or http URL) is treated as a binary part.

Analysis

Previous state: v.data.length > 1 — strings of 0 or 1 characters were excluded
New state: No length check — any non-empty string is included

Is this correct? ✅ Yes. The removed check was overly restrictive:

  1. Single-character base64 data is valid (though rare in practice)
  2. Empty strings are handled safely — if data is '', it still matches typeof v.data === 'string', but:
    • stripBinaryDataForObservability would produce [binary data ~0 bytes] — harmless
    • sanitizeArtifactBinaryData would upload a 0-byte file — also harmless (content-hash deduplication handles this)
  3. The prior "Consider" item about missing boundary tests is now moot — the simpler heuristic doesn't need boundary testing

Impact: Simplifies the detection heuristic with no functional regression. Edge cases (empty/tiny strings) are handled safely.

🕐 Pending Recommendations (1)

  • 🟠 sanitizeArtifactBinaryData Function is exported and tested but unused in production — understood as preparatory infrastructure for a follow-up PR

✅ APPROVE

Summary: Clean simplification. Removing the > 1 length check makes the binary detection heuristic more straightforward without introducing regressions. The prior review's APPROVE stands — all feedback has been addressed, and this final tweak is a sensible cleanup. 🚀

Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| N/A — trivial delta review | 0 | 0 | 0 | 0 | 0 | 1 | 0 |

Note: Domain-specific reviewers were not dispatched because the delta consists only of a single line removal. The pending recommendation is carried forward from prior reviews.

@itoqa

itoqa bot commented Mar 26, 2026

Ito Test Report ✅

12 test cases ran. 12 passed.

Across the unified QA run, all 12 executed test cases passed with 0 failures, and verification found no confirmed or plausible production-code defects. Most importantly: run-route handling worked for inline and nested mixed-content inputs with successful HTTP 200 SSE streaming; external/failed PDF ingestion paths consistently returned controlled HTTP 400 responses with URL token/query/hash redaction and fallback behavior (no 5xx); and traces/artifact and UI stress checks (sanitized payload placeholders with preserved mimeType metadata, clean JSON downloads, mobile/back-forward/deep-link/filter abuse) remained stable and usable despite recoverable telemetry configuration errors.

✅ Passed (12)
Category Summary Screenshot
Adversarial Verified that PDF URL ingestion errors sanitize source URLs by stripping query/hash values, preventing sensitive token leakage in error surfaces. N/A
Adversarial Traces UI remained usable through deep-link query fuzzing, rapid filter toggles, back/forward loops, and iPhone 12 viewport checks. ADV-6
Edge Nested mixed-content payload parsing and SSE streaming completed successfully in re-verification. EDGE-1
Edge Mobile artifact panel usability remained intact for large scrubbed payload flows after re-investigation. EDGE-3
Edge Back/forward navigation loops preserved conversation artifact context without losing the JSON panel state. EDGE-4
Logic Re-verification confirmed inline base64/data URI payloads are replaced by binary-placeholder text before trace attributes are emitted. LOGIC-1
Logic Artifact trace JSON preserves mimeType context in sanitized placeholder output. LOGIC-2
Logic Downloaded artifact JSON remained sanitized and excluded raw binary/base64 signatures. LOGIC-3
Logic Mixed text plus failing remote file input produced a controlled fallback response without 5xx. LOGIC-4
Logic Persisted artifact data rendered the expected key shape (placeholder, mimeType, sanitized, originalBytes, reason) without raw payload leakage. LOGIC-5
Happy-path Inline PDF data URI request returned expected SSE streaming response (HTTP 200). ROUTE-1
Happy-path Unreachable external PDF URL returned a controlled HTTP 400 with sensitive token redaction. ROUTE-3

Commit: 7dd41d8

View Full Run



@tim-inkeep
Contributor

@mike-inkeep does this only find binary data within artifacts that have been saved? what about the artifacts themselves being moved to blob storage?

